Towards Scaling Fully Personalized PageRank
نویسندگان
چکیده
Personalized PageRank expresses backlink-based page quality around user-selected pages in a similar way as PageRank expresses quality over the entire Web. Existing personalized PageRank algorithms can however serve on-line queries only for a restricted choice of page selection. In this paper we achieve full personalization by a novel algorithm that computes a compact database of simulated random walks; this database can serve arbitrary personal choices of small subsets of web pages. We prove that for a fixed error probability, the size of our database is linear in the number of web pages. We justify our estimation approach by asymptotic worst-case lower bounds; we show that exact personalized PageRank values can only be obtained from a database of quadratic size.
منابع مشابه
Towards Scaling Fully Personalized PageRank: Algorithms, Lower Bounds, and Experiments
Personalized PageRank expresses link-based page quality around userselected pages in a similar way as PageRank expresses quality over the entire web. Existing personalized PageRank algorithms can, however, serve online queries only for a restricted choice of pages. In this paper we achieve full personalization by a novel algorithm that precomputes a compact database; using this database, it can...
متن کاملSpamRank -- Fully Automatic Link Spam Detection
Spammers intend to increase the PageRank of certain spam pages by creating a large number of links pointing to them. We propose a novel method based on the concept of personalized PageRank that detects pages with an undeserved high PageRank value without the need of any kind of white or blacklists or other means of human intervention. We assume that spammed pages have a biased distribution of p...
متن کاملAn Application of Personalized PageRank Vectors: Personalized Search Engine
We introduce a tool which is an application of personalized pagerank vectors such as personalized search engines. We use pre-computed pagerank vectors to rank the search results in favor of user preferences. We describe the design and architecture of our tool. By using pre-computed personalized pagerank vectors we generate search results biased to user preferences such as top-level domain and r...
متن کاملPersonalized PageRank with Node-Dependent Restart
Personalized PageRank is an algorithm to classify the improtance of web pages on a user-dependent basis. We introduce two generalizations of Personalized PageRank with nodedependent restart. The first generalization is based on the proportion of visits to nodes before the restart, whereas the second generalization is based on the probability of visited node just before the restart. In the origi...
متن کاملTowards Supporting Exploratory Search over the Arabic Web Content: The Case of ArabXplore
Due to the huge amount of data published on the Web, the Web search process has become more difficult, and it is sometimes hard to get the expected results, especially when the users are less certain about their information needs. Several efforts have been proposed to support exploratory search on the web by using query expansion, faceted search, or supplementary information extracted from exte...
متن کامل